ABSTRACT

We present a study of the group purchasing behavior of daily 
deals in Groupon and LivingSocial and introduce a predic- 
tive dynamic model of collective attention for group buying 
behavior. In our model, the aggregate number of purchases 
at a given time comprises two types of processes: random 
discovery and social propagation. These processes are very 
clearly separated by an inflection point. Using large data 
sets from both Groupon and LivingSocial we show how the 
model is able to predict the success of group deals as a func- 
tion of time. We find that Groupon deals are easier to pre- 
dict accurately earlier in the deal lifecycle than LivingSocial 
deals due to the final number of deal purchases saturating 
more quickly. One possible explanation is that the incentive to
socially propagate a deal is based on an individual threshold in
LivingSocial, whereas in Groupon it is based on a collective
threshold that is reached very early. Furthermore, the personal
benefit of propagating a deal is greater in LivingSocial.
1. INTRODUCTION

Attracting the attention of potential customers in today's 
information rich social media is a challenge. As a result mar- 
keters have been forced to target customers in more sophis- 
ticated ways. Location-based (regional) and hyper-location- 
based (within eye-sight) targeting has turned out to be very 
effective in terms of improving conversion rates from views to 
purchases [10] . However, since people are unwilling to share 
their exact locations out of privacy concerns they need to be 
given some incentive to reveal their position. The most successful
incentive employed to date is daily deals^1. In spite of
the success of this strategy it is not fully understood what 
makes it successful and what kind of social behavior the daily 
deals sites so effectively tap into and exploit. However, it 
is clear that deadlines and social propagation play impor- 
tant roles in addition to location-based targeting. The main 
question we are addressing in this work is how to describe 
the purchasing pattern more precisely in order to predict the 
future popularity of a deal. 

We analyzed data from Groupon and LivingSocial, the 
current market leaders of daily deals in the US. Groupon 
promotes deals for different geographic markets, or cities, 
called divisions. In each division, there is typically one fea- 
tured daily deal. A deal is a coupon for some product or 
service at a substantial discount off the regular price. Deals 
may be available for one or more days. Coupons are only 
redeemable if a certain minimum number of customers pur- 
chases the deal, and this number constitutes what Groupon 
calls a tipping point. Furthermore, sellers may set a max- 
imum threshold size to limit the number of coupons that 
can be purchased. LivingSocial is similar to Groupon, ex- 
cept that there is no tipping point. The incentive that drives 
users to buy deals is the following commitment made by LivingSocial:
"Buy first, then share a special link with friends, if three friends
buy, yours is free!"^2

A closer examination of the mechanisms driving user behavior in
group deals could provide useful guidance for local marketing
campaigns. In this paper we study the evolution of collective
attention measured as deal purchases. We base our analysis on data
collected from Groupon over two months and from LivingSocial over
one month. Our assumption is that successful deals arise from two
behavioral processes: random discovery, resulting from the
serendipitous discovery of a deal on the web portal, in the mobile
app, or via an email subscription; and social propagation, resulting
from the propagation of deals over social networks. These processes
are separated by an inflection point, which in Groupon is the
tipping point, after which there are enough purchases to guarantee
deal transactions. Before the inflection point is reached the
customer base is small, so the random discovery process dominates.
Conversely, after the inflection point, a critical mass of customers
has discovered the deal, so social propagation dominates the
purchasing behavior.

^1 http://www.bynd.com/2011/05/04/social-loco-research/
^2 http://www.livingsocial.com

The contributions of this paper fall into two categories: 

• Structure of purchasing dynamics. We present 
a stochastic model that analytically explains the ob- 
served purchasing behavior. 

• Prediction model for purchases. We show how 
the model is able to predict the success of group deals 
as a function of time. 

The paper is structured as follows. In Section 2 we discuss
related work. In Section 3 we discuss the data sets and the
collection strategies used in our study. Section 4 describes
our stochastic model and verifies it empirically. Then in Section 5
we use our model to predict purchase volume and benchmark it
against some baselines. Section 6 concludes with possible
applications of our work and future directions.

2. RELATED WORK 

The related work comes from two broad areas: social purchasing
behavior and collective attention.

2.1 Social Purchasing Behavior 

According to 13 0, a buyer's social network strongly influences
her purchasing behavior. In [9], Guo et al. analyze data from the
e-commerce site Taobao^3 to understand how individuals' commercial
transactions are embedded in their social graphs. In the study,
they show that implicit information passing exists in the Taobao
network, and that communication between buyers drives purchases.
However, according to the study presented in [15], social factors
may have a different level of impact on user purchase behavior for
different e-commerce products.

Several studies have been conducted to understand various aspects
of Groupon. In [1], Arabshahi examined the business model of
Groupon and concluded that its advantage is the economic potential
to leverage simple technologies (e.g., web portal and email
subscription) to address deeply embedded inefficiencies in life.
In [6], Dholakia conducted a survey-based study on Groupon in order
to understand how businesses fare when running group promotions.
Employee satisfaction, rather than features of the promotion or its
effect, was found to be the factor that correlates most strongly
with the profit gained from a promotion. Effectiveness in reaching
new customers and the percentage of Groupon users who bought more
than the deal's value during the visit were important factors for
small merchants when considering whether to run another promotion.
In [8], Grabchak et al. study the problem of selecting Groupon-style
chunked reward ads. To address the problem, they devise several
adaptive greedy algorithms in a stochastic knapsack framework.

^3 Taobao is a Chinese consumer marketplace and also the world's
largest e-commerce website, http://www.taobao.com

The paper most related to our work is [4], where data on the
purchase history of Groupon deals were analyzed. One key outcome
of [4] is the preliminary evidence that Groupon is behaving
strategically to optimize deal offerings, giving customers "soft"
incentives (e.g., deal scheduling and duration, deal featuring, and
limited inventory) to make a purchase. Our work differs from these
studies by focusing on modeling the deal purchasing dynamics over
time and by highlighting the importance of the tipping point and
its implication for social propagation.

2.2 Collective Attention 



In [13, 12, 14], Lerman et al. propose to use a stochastic model
to describe the social dynamics of web users, with Digg as a case
study. The stochastic model focuses on describing the aggregated
(by average quantities) behavior of the system, including the
average rate at which users contribute new stories and vote on
existing stories. With the devised stochastic model, the popularity
of a Digg story can be predicted shortly after it is submitted (or
once it has 10 to 20 votes). Studies in [11, 3, 5] have found that
early diffusion of information within a community could be a good
predictor of how far it will spread.

Recent studies of collective attention on social media sites 
such as Twitter, Digg and YouTube [17, 16, 2] have clarified
the interplay between popularity and novelty of user generated
content. The allocation of attention across items was
found to be universally log-normal, as a result of a multi- 
plicative process that can be explained by an information 
propagation mechanism inherent in all these sites. While 
the specific time scales over which novelty decays differ be- 
tween different systems depending on their typical type of 
content, the functional form of the decay is consistent and 
thus future popularity is predictable. 

3. DATASETS 

We collected data from Groupon's socially promoted and 
local daily deal websites in the US. We also collected data 
from LivingSocial to verify that our models could be applied 
more generally across group deal sites. 

Groupon provides a convenient API^4, which allows us to obtain
more detailed information about the deals. By the end of April
2011, Groupon's business covered about 120 cities in the US^5. We
monitored all Groupon deals offered in 60 different randomly
selected cities during the period between April 4th and June 16th,
2011. In total we collected the entire purchase traces of 4349
deals.

In LivingSocial, there is no API available for us to pe- 
riodically obtain information about deals, so we developed 
a crawler to visit the webpages of deals periodically. After 
crawling for one month, we collected traces from over 900 
deals. 

Next, to give a flavor of the type of data being used, we examine
the features of Groupon deals in more detail. A similar examination
for LivingSocial is outside the scope of this work. However, we
will later see that the models inferred from these observations
apply to LivingSocial as well.

^4 http://www.groupon.com/pages/api
^5 Statistics obtained from the Groupon API.

Description             coefficient       standard error    t-value     p-value
Intercept               -4.094 × 10^?     5.9776 × 10^?     -0.6849     0.4935
Tipping Point           0.7316            0.029             25.2276     6.5792 × 10^-125 (***)
Featured position       0.7004            0.0463            15.1189     2.0166 × 10^? (***)
Duration                0.0062            4.8862 × 10^-4    12.6412     1.6054 × 10^-35 (***)
is limited or not       -2.6105 × 10^?    2.0969 × 10^?     -12.4494    1.5597 × 10^-34 (***)
Retail Price            -0.0082           0.0458            -0.1797     0.8574
Discount                -0.0011           1.6681 × 10^-4    -6.3744     2.1908 × 10^-10 (***)
Sunday                  0.0061            0.0022            2.7358      0.0063 (***)
Nightlife               0.3208            0.1515            2.1180      0.0343 (*)
Health&Fitness          0.6429            0.0849            7.5722      5.1827 × 10^-14 (***)
Travel                  -0.1789           0.0782            -2.2874     0.0223 (*)
Automotive              -0.3289           0.1366            -2.4074     0.0161 (*)
Professional Services   0.2552            0.1390            1.8363      0.0664
atlanta                 -2.0460           0.9373            -2.1829     0.0291 (*)
albuquerque             -1.8548           0.9365            -1.9806     0.0478 (*)
austin                  -2.4329           0.9516            -2.5567     0.0106 (*)
abbotsford              -2.1012           0.9392            -2.2371     0.0254 (*)
barrie                  -2.2454           0.9496            -2.3646     0.0181 (*)

Table 1: Multivariate linear regression of number of purchases. N = 3876, R-square = 0.5952, adjusted
R-square = 0.5857. Note that, due to space limitation, we only show the result with p-value smaller than 5%
for the launching day, category and division study.

3.1 Groupon Deal Characteristics 

At the time of our study the Groupon website presented the
following relevant deal information: description, discount, time of
launch, tipping point (purchases required for a deal to actually be
sold), and the maximum number of sales of the deal. Additionally,
users could monitor the current number of purchases^6 and whether
the deal had tipped or sold out. We monitored the number of
purchases and the position of each deal in 20-minute time intervals.
A surprisingly large portion (10%) of all deals exhibited dramatic
non-monotonic behavior, e.g., a decrease of 10 purchases between
subsequent intervals. This may indicate that something was wrong
with the deal, e.g., false marketing due to an inflated list price,
and that customers who initially purchased the deal requested a
refund (an option Groupon supports and markets). Because the user
behavior behind these deal actions is unknown, we exclude these
deals from our study, which leaves 3876 deals to analyze. In our
dataset, 270 deals (out of 3876) had not reached their tipping point
when they expired. In the following, these deals are called failed
deals, and deals that are turned on successfully are called tipped
deals.

3.1.1 Attributes of Deals 

Here we present some statistics about attributes of the 
deals in our Groupon dataset, including retail price, dis- 
count, deals needed to tip (tipping point), time needed to 
tip (tipping time), lifetime of a deal and final number of 
purchases. 



^6 Since our study, the current number of purchases has been
removed and replaced with an obfuscated threshold to make
it harder to make predictions.



Groupon deals have different retail prices and discounts.
The mean retail price is $44 and the mode is $10. We observe that
most of the discounts range from 50% to 60%, and the mode is 50%.
Based on these statistics, we see that the products and services
offered on the Groupon website are usually inexpensive, and the
discounts are usually large.

In Groupon, deals may have different tipping points, and
successful deals may also have different tipping times even
when they have the same tipping point. The average tipping point
(units needed to tip) is 22 (the mode is around 10), and the
expected tipping time is about 10.5 hours (the mode is around 6.67
hours). Most of the time, deals in Groupon tipped within one day.

Note that the lifetime of a deal in Groupon is usually set 
to 1 day, 2 days, 3 days or 4 days. The average number of 
purchases of a deal is 373. A deal may be specified with a 
limited available quantity. So these numbers are mixtures 
of different factors, such as the quality of a deal itself, the 
quantity available etc. 

3.1.2 Factors Impacting Purchases

As we are ultimately interested in modelling purchase dy- 
namics of deals, we first need to understand what factors im- 
pact purchases. Hence, we regress the attributes discussed in 
the previous section against the final number of purchases of 
a deal. If the Groupon commission is known^7, this number
also gives a good estimate of the merchant's revenue from a
deal.

The model we use is as follows. Let $N_L$ denote the final
number of purchases, $\theta$ the number of purchases needed to
tip (tipping point), $f$ whether the deal is listed in featured
position (1) or not (0) at the current time, $L$ the time until
the $N_L$-th purchase, $p$ the retail price, $d$ the discount, and
finally $l$ whether the deal inventory is limited (1) or not (0).
The parameters $w$, $c$ and $g$ are vectors encoded as in [4] to
represent the launch day, category, and city. The following
equation is also taken from [1].

^7 Reportedly 50% in [1].



$$\log N_L = \beta_0 + \beta_1 \log\theta + \beta_2 f + \beta_3 L + \beta_4 l + \beta_5 p + \beta_6 d + \beta_7 w + \beta_8 c + \beta_9 g \tag{1}$$

where $\beta_0, \ldots, \beta_9$ are the coefficients of the linear model.

We fitted the model using multivariate linear regression.
The parameter estimates, their standard errors, t-values and
p-values are listed in Table 1. Due to space limitations, only
attributes with significance level (p-value) smaller than 5%
are shown in the table. Among those attributes, we find that
tipping point and featured position are the two most significant
factors that can help predict the number of purchases.
Surprisingly, tipping point seems to have better predictive
power than featured position (i.e., the t-value is much larger
for the tipping point factor than for the featured position
factor). In the next section, we show how the tipping time
can be generalized as an inflection point in the purchase
dynamics of group deals.
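
The log-linear regression of Equation (1) can be illustrated with a minimal sketch. It assumes synthetic data and only three of the regressors (intercept, log tipping point, featured flag); the helper `ols`, the generating coefficients in `TRUE_BETA`, and the noise level are hypothetical choices for illustration, not the fitting code or data used in the study.

```python
import math
import random

def ols(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

random.seed(0)
# Hypothetical generating coefficients: intercept, log(theta), featured flag.
TRUE_BETA = [1.0, 0.73, 0.70]
rows, targets = [], []
for _ in range(2000):
    theta = random.choice([5, 10, 20, 30, 50])   # synthetic tipping points
    featured = random.choice([0, 1])             # synthetic featured flag
    x = [1.0, math.log(theta), float(featured)]
    log_n = sum(b * v for b, v in zip(TRUE_BETA, x)) + random.gauss(0.0, 0.5)
    rows.append(x)
    targets.append(log_n)                        # plays the role of log N_L

beta_hat = ols(rows, targets)
print([round(b, 2) for b in beta_hat])
```

With enough synthetic deals the estimates land close to the generating coefficients; the remaining regressors of Equation (1) would enter as additional columns of `rows`.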

4. PURCHASE DYNAMICS 

In this section, we propose a model of the purchase dy- 
namics of group deals. A group deal is generally discovered 
by the user in one of the following four ways: (1) by visiting 
a web-page, (2) by running a smart-phone application, (3) 
by getting notifications via email and (4) by communicating 
with friends. Here, we refer to the first three as random 
discovery and the fourth is referred to as social propagation. 

Based on this notion, our model describes the purchase
dynamics as follows. Let $N_t$ denote the number of times
that the deal has been purchased by time $t$. We then have

$$N_{t+\Delta t} - N_t = \alpha_t Y_t + \beta_t f(t, N_t), \tag{2}$$

where $\alpha_t$ and $\beta_t$ are weight factors, $Y_t$ is a
non-negative random variable denoting the number of purchases
caused by random discovery in the interval $(t, t+\Delta t]$, and
$f(t, N_t)$ represents the number of purchases caused by social
propagation in the same interval as a function of $t$ and $N_t$.




Figure 1: Purchase growth of deals. Panels: (a) Groupon, (b) LivingSocial; horizontal axis: time (hours).

We average the number of purchases of deals for each time
step in both Groupon and LivingSocial. As shown in Figure 1,
deals in LivingSocial grow faster than those in Groupon in
the first few hours. A possible reason is the different incentive
that LivingSocial uses to promote deals: LivingSocial users who
want to get free deals may disseminate deal information more
eagerly.

Furthermore, there is an inflection point in the purchase
dynamics for both Groupon and LivingSocial deals (after
around 7 and 4 hours in Figure 1(a) and (b), respectively),
after which the number of purchases grows faster; before the
inflection point, the number of purchases grows relatively slowly
and steadily. Note that after the inflection point, the number of
purchases grows dramatically for about 11.6 and 14.8 hours in
Groupon and LivingSocial, respectively, after which the purchase
rate drops.

Figure 2: Normalized purchase growth of deals

One may argue that this inflection point could be caused
by time-of-day seasonality, given that all deals are local to
a region belonging to a single time zone. For example, most
people do not buy deals at night, but early in the morning
when they wake up. Hence, we normalize the number of purchases
by removing the seasonal impact to examine whether the inflection
point is caused by the time the deal is launched, as shown in
Figure 2. In Groupon, 95% of the deals are launched before 7:00am
and 50% of these are launched between 4:00am and 6:00am. Hence, we
cluster deals in three groups: those that launch around 4:00am,
5:00am, and 6:00am, respectively. As shown in Figure 2(a),
normalized purchase growth of deals clearly has two stages, which
are divided by an inflection point. Before the inflection point, it
shows non-linear growth, while after the inflection point, it obeys
linear growth. In LivingSocial, deals are launched between 4:00am
and 6:00am, as in Groupon. Interestingly, in Figure 2(b), we find
that the inflection point in the purchase growth of LivingSocial
deals disappears after the normalization. In addition, deals
launched at the same time (e.g., at 4:00am) exhibit different
purchase dynamics in Groupon and LivingSocial; e.g., in Figure 2
the purchase dynamics of Groupon deals still exhibit an inflection
point, while there is none for LivingSocial deals. These
observations suggest that (1) the consistent launch times may cause
the two-stage purchase growth in LivingSocial, but (2) in Groupon
the inflection point cannot be attributed solely to the time the
deal is launched, and the tipping-point mechanism may also play a
role.

Based on the above observations we write our equation as:

$$N_{t+\Delta t} - N_t = \begin{cases} Y_t & \text{before the inflection point} \\ r(t) X_t N_t & \text{after the inflection point} \end{cases} \tag{3}$$



Thus, we are implicitly assuming that before the inflection
point $\alpha_t = 1$ and $\beta_t = 0$, whereas after the inflection
point $\alpha_t = 0$ and $\beta_t = 1$ in (2). This assumption is
motivated by the fact that random discovery dominates before the
inflection point and social propagation dominates afterwards,
even though the two processes may coexist. In particular, before
the inflection point the customer base is small, so the random
discovery process dominates. In addition, in Groupon, before the
deal has tipped, people will hesitate to make a purchase, as it is
still uncertain both whether the deal was considered good by others
and whether it will be offered, which reduces the effects of social
propagation. After the inflection point both of these uncertainties
are gone.

According to (3), after the inflection point, the increase
in the number of purchases ($N_{t+\Delta t} - N_t$) is proportional
to the number of people who have purchased the deal up to
time $t$. Intuitively, a fraction of the people who already
purchased the deal will notify some of their friends about it,
and a fraction of these friends will purchase the deal. These
fractions are represented by the positive random variable
$X_t$. We assume that $\{X_t\}$ are independent and identically
distributed random variables. Since $X_t$ is assumed to be
positive, $N_t$ can only increase over time. This growth is
eventually curtailed by a decay in novelty, which is
parameterized by the factor $r(t)$. As we discuss later, $r(t)$
is decreasing in $t$. This notion of social propagation is
borrowed from, and motivated in more depth in, [17].
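
As an illustration of the two-stage dynamics in Equation (3), the following is a minimal simulation sketch under several assumptions that are not part of the model statement above: hourly time steps, Poisson arrivals for random discovery, exponentially distributed $X_t$, and an exponential novelty decay $r(t) = \exp(at + b)$ of the kind fitted later in Section 4.2. The function name `simulate_deal` and all parameter values are hypothetical.

```python
import math
import random

def simulate_deal(theta, rate, a, b, mean_x, hours, seed=0):
    """Simulate cumulative purchases per hour under the two-stage model.

    theta  : inflection (tipping) point in purchases
    rate   : assumed Poisson arrival rate (purchases/hour) before inflection
    a, b   : assumed novelty decay r(t) = exp(a*t + b) after inflection
    mean_x : assumed mean of the positive i.i.d. variables X_t
    """
    rng = random.Random(seed)
    n = 0.0
    steps_after = 0
    trace = []
    for _ in range(hours):
        if n < theta:
            # Stage 1 (random discovery): count arrivals within one hour by
            # accumulating exponential interarrival times.
            elapsed = rng.expovariate(rate)
            while elapsed < 1.0:
                n += 1
                elapsed += rng.expovariate(rate)
        else:
            # Stage 2 (social propagation): multiplicative growth r(t) X_t N_t.
            steps_after += 1
            r = math.exp(a * steps_after + b)
            x = rng.expovariate(1.0 / mean_x)
            n += r * x * n
        trace.append(n)
    return trace

trace = simulate_deal(theta=10, rate=2.0, a=-0.21, b=-2.0, mean_x=1.0, hours=24)
print([round(v, 1) for v in trace])
```

The switch between stages is checked once per hour, so the count can overshoot `theta` slightly before the multiplicative stage starts; a finer time step would reduce this effect.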

4.1 Purchase Dynamics Before Inflection 

We denote by $\tau_i$ the interarrival times of purchases. In
particular, $\tau_i$ is the time between the $(i-1)$-th and the
$i$-th purchases. Suppose that each $\tau_i$ is independently drawn
from some distribution $F$. We denote a deal's inflection
point by $\theta$, that is, the number of purchases required before
social propagation dominates. Let $L$ be the total time that
the deal is open for purchases (as set by the seller). Then,
$N_L$ is the final number of purchases when the deal ends.

Let $F_n$ denote the $n$-fold convolution of $F$. Then, $F_n$
is the distribution of the sum of $n$ consecutive interarrival
times. Thus, the distribution of the time needed to reach the
inflection point $\theta$ is given by $F_\theta$, the $\theta$-fold
convolution.
Figure 3: Purchase growth for deals with tipping
(inflection) points of 10 and 20, respectively, in
Groupon

Figure 3 shows how the number of Groupon deal purchases
increases over time when the tipping point is equal
to 10 (the most frequent value) and 20 purchases. The plot
is based on 492 (resp. 477) deals whose tipping point was
equal to 10 (resp. 20) in our dataset. We observe the same
pattern for deals with other tipping points, e.g., 5 and 30.
We find an approximately linear growth of purchases at the
beginning of the lifetime of a deal. For both tipping points,
the purchase rate is relatively small and steady before the
tipping time. After tipping or around tipping, the number
of purchases grows dramatically for about 11.6 hours, after
which the purchase rate drops. The tipping time thus typically
coincides with the inflection point time in the purchase dynamics.

Note that the final number of purchases of a deal with 
a tipping point of 10 purchases is usually smaller than the 
corresponding number for a deal with a tipping point of 20, 
even though we do not observe a significant difference before 
the tipping times. One possible reason is that deals tipping 
after 10 purchases have smaller purchase populations than 
those that tip after 20 purchases, depending on the specific 
categories of products and services. Furthermore, the po- 
tential purchase population may also act as the reference 
for Groupon and local merchants when they set the tipping 
point for a deal. 

We now look at the probability that a deal fails, i.e., does
not reach the inflection point. We say that a deal is turned
on as long as its number of purchases reaches the inflection
point $\theta$ before the deal expires, i.e., before its lifetime
$L$ ends. So the probability of a deal failing is equal to
$\Pr(N_L < \theta)$:

$$\Pr(N_L < \theta) = \sum_{n=1}^{\theta-1} \Pr(N_L = n) \tag{4}$$



Since the $\tau_i$ variables are i.i.d. interarrival times of
purchases, it follows that this is a renewal process. We use
$S_n = \sum_{i=1}^{n} \tau_i$ to denote the time spent until the
$n$-th purchase. It is easy to see that
$N_t = \sup\{n : S_n \le t\}$, and thus

$$\begin{aligned} \Pr(N_t = n) &= \Pr(N_t \ge n) - \Pr(N_t \ge n+1) \\ &= \Pr(S_n \le t) - \Pr(S_{n+1} \le t) \\ &= F_n(t) - F_{n+1}(t) \end{aligned} \tag{5}$$

Applying Equation (5) to Equation (4), we have:

$$\begin{aligned} \Pr(N_L < \theta) &= \sum_{n=1}^{\theta-1} \left( F_n(L) - F_{n+1}(L) \right) \\ &= F(L) - F_\theta(L) \end{aligned} \tag{6}$$



Note that Equation (6) can predict the failure ratio (i.e.,
the probability of not being turned on) of a deal. Conversely,
using this equation, given the failure ratio, we can estimate
the parameters of $F$, such as the mean value.

This analytical model can be easily extended to predict
whether a deal will be turned on when we know
the number of purchases up to a given point in time. For
example, if at time $t_1$ a deal already has $n_1$ purchases,
then the probability that the deal will not be turned on can be
estimated as

$$\Pr(N_L < \theta \mid N_{t_1} = n_1) = F(L - t_1) - F_{\theta - n_1}(L - t_1) \tag{7}$$

and the probability that it will be turned on is the complement of
this value.
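
Equations (6) and (7) can be evaluated in closed form once a distribution $F$ is chosen. The sketch below assumes exponential interarrival times with a hypothetical rate `rate` (the case examined later in this section), so that $F_n$ is the Erlang (Gamma with integer shape) CDF. It follows the equations as written, whose sum starts at $n = 1$ and therefore does not count the case of zero purchases. The function names are illustrative.

```python
import math

def erlang_cdf(n, rate, t):
    """F_n(t): CDF of the sum of n i.i.d. exponential(rate) interarrival times."""
    if t <= 0:
        return 0.0
    x = rate * t
    term, total = 1.0, 1.0          # k = 0 term of the Poisson sum
    for k in range(1, n):
        term *= x / k               # x^k / k!
        total += term
    return 1.0 - math.exp(-x) * total

def failure_probability(theta, rate, lifetime, t1=0.0, n1=0):
    """Equation (7); with t1 = 0 and n1 = 0 it reduces to Equation (6)."""
    if n1 >= theta:
        return 0.0                  # the deal has already tipped
    remaining = lifetime - t1
    return erlang_cdf(1, rate, remaining) - erlang_cdf(theta - n1, rate, remaining)

# Hypothetical example: tipping point 10, one purchase per hour, 24-hour deal.
p_fail = failure_probability(theta=10, rate=1.0, lifetime=24.0)
# Same deal observed at hour 20 with only 2 purchases so far.
p_fail_late = failure_probability(theta=10, rate=1.0, lifetime=24.0, t1=20.0, n1=2)
print(p_fail, p_fail_late)
```

Conditioning on a slow start with little time remaining raises the estimated failure probability, which is the kind of early-warning use the extension in Equation (7) is meant to support.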

We now consider what distribution the interarrival times
follow in Groupon. To exclude the impact of tipping point
differences, we first consider only deals with a tipping point
of 10 purchases (the tipping point distribution mode) from
all the data we gathered. As shown in Figure 4, interarrival
times follow an exponential distribution. Thus, before tipping,
purchases arrive according to a Poisson process.

Figure 4: Distribution of waiting time for a purchase.
This result is based on all deals with a tipping
point of 10 purchases, in Groupon

This observation confirms our assumption about random
discovery, since if a user randomly checks the websites or a
smartphone app the probability of a purchase taking place
in the next infinitesimal time interval is the same, and
hence the intervals between purchases follow an exponential
distribution. The exponential fit in Figure 4 has an $R^2$ value
of 0.9784. We also check the interarrival times of purchases
in LivingSocial during the first 4 hours, and find that
interarrival times in LivingSocial also follow an exponential
distribution.

An important conclusion from our model is that the distribution
of tipping time in Groupon is expected to follow an
$n$-fold convolution of the distribution $F(t)$. Given that
$F(t)$ is an exponential distribution, the tipping time of deals
with a tipping point of 10 purchases should follow a Gamma
distribution with a shape factor equal to 10. We compare the
predicted distribution of tipping time with that of real values
gathered online; the histogram and PDF (curved line) of the
empirical and modelled distributions, respectively, are shown
in Figure 5. Note that there are some deals in Groupon
that are very appealing, and thus were tipped immediately
after they were launched. Nevertheless, the predicted tipping
time distribution of Groupon deals is similar to the
empirical one.

Figure 5: Predicted tipping time distribution vs.
empirical tipping time distribution. The result is
based on all deals with tipping point equal to 10, in
Groupon

4.2 Purchase Dynamics After Inflection

We now focus on the dynamics after the inflection point,
and for expositional clarity consider the time of inflection
as time 0. Thus, $N_0$ denotes the number of purchases of a
deal at the inflection point time. Then, according to
Equation (3), the number of purchases at time $T$ (that is, $T$
time units after the inflection point) is given by

$$N_T = \prod_{t=1}^{T} \left(1 + r(t) X_t\right) N_0 \tag{8}$$

Note that the realization of $X_t$ will in general be different
in different time periods; however, all random variables $X_t$
follow the same distribution. When $X_t$ is small (which is the
case for small time steps), we have the following approximate
solution for $N_T$:

$$N_T \approx \prod_{t=1}^{T} e^{r(t) X_t} N_0 = e^{\sum_{t=1}^{T} r(t) X_t} N_0. \tag{9}$$

Taking the logarithm on both sides, we get

$$\log N_T - \log N_0 = \sum_{t=1}^{T} r(t) X_t \tag{10}$$

Figure 6: Process of novelty decay

The decay factor $r(t)$ is estimated according to Equation (8)
and Equation (10) as follows:

$$r(t) = \frac{E(\log N_t) - E(\log N_{t-1})}{E(\log N_1) - E(\log N_0)} \tag{11}$$

where we normalize $r(1)$ to 1. This calculation is again
borrowed from, and evaluated in more detail in, [17].

In Figure 6 we plot the novelty decay $r(t)$ for the first
16 and 20 hours after the inflection point in Groupon and
LivingSocial, respectively, as estimated from our dataset.
Note that tipping time is usually around 8 hours, so we focus
on the time duration of 16 hours after tipping in Groupon.
Recall that in this section $N_0$ denotes the tipping point,
and time $t = 0$ is the tipping time. We observe that $r(t)$
decreases over time. Moreover, Figure 6 suggests that the
novelty decay is exponential. In particular,

$$r(t) \approx \exp(at + b), \tag{12}$$

where in Groupon $a = -0.21$ and $b = -2$, and the $R^2$ value
for this fit is 0.8839; and in LivingSocial $a = -0.11$ and
$b = -0.28$, and the $R^2$ value for this fit is 0.9190.

Figure 7: Empirical verification of our model
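
The estimation in Equations (11) and (12) can be sketched as follows. This is a minimal illustration on synthetic traces, assuming uniformly distributed $X_t$ and a known generating decay rate `TRUE_A`; the function names, the trace generator, and all numeric values are hypothetical and are not the study's data or code.

```python
import math
import random

def novelty_decay(traces):
    """Equation (11): increments of the mean log-count, normalized so r(1) = 1."""
    horizon = len(traces[0])
    mean_log = [
        sum(math.log(tr[t]) for tr in traces) / len(traces) for t in range(horizon)
    ]
    base = mean_log[1] - mean_log[0]
    return [(mean_log[t] - mean_log[t - 1]) / base for t in range(1, horizon)]

def fit_exponential(r):
    """Least-squares fit of log r(t) = a*t + b for t = 1, 2, ... (Equation (12))."""
    ts = list(range(1, len(r) + 1))
    ys = [math.log(v) for v in r]
    t_bar = sum(ts) / len(ts)
    y_bar = sum(ys) / len(ys)
    a = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys)) / sum(
        (t - t_bar) ** 2 for t in ts
    )
    return a, y_bar - a * t_bar

# Synthetic post-inflection traces: N_t = N_{t-1} * (1 + r(t) X_t).
rng = random.Random(1)
TRUE_A = -0.2                      # assumed decay rate used to generate the data
traces = []
for _ in range(2000):
    n, trace = 10.0, [10.0]        # N_0 = 10 purchases at the inflection point
    for t in range(1, 13):
        r_t = math.exp(TRUE_A * (t - 1))          # r(1) = 1 by construction
        n *= 1.0 + r_t * rng.uniform(0.05, 0.15)  # small positive X_t
        trace.append(n)
    traces.append(trace)

r_hat = novelty_decay(traces)
a_hat, b_hat = fit_exponential(r_hat)
print(round(a_hat, 3), round(b_hat, 3))
```

Because the estimator works on differences of mean log-counts, it relies on the small-$X_t$ approximation behind Equation (9); with larger multiplicative steps the recovered slope would be biased.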

Next, we are interested in evaluating how well our model
helps explain the purchase growth after a deal has turned on.
With both $a$ and $b$ estimated, we can use our results to explain
the growth of purchases. In Figure 7 we demonstrate the
potential predictive power of our model by empirically
verifying the growth of purchases of deals after they have tipped.
For the model fitting in Figure 7, the $R^2$ value is 0.9404 and
0.9903 in Groupon and LivingSocial, respectively.

5. PURCHASE PREDICTION 

In this section, we discuss how to use our models to predict 
the number of purchases of deals at a given time. Purchase 
prediction is important for both group deal websites and lo- 
cal merchants. Accurate forecasts may help group deal web- 
sites design more optimized deal scheduling and promotion 
strategies and aid local merchants in allocating resources 
more efficiently. 

We now discuss methods which make predictions based 
on h hours of previous observations. 

5.1 Predictors 

5.1.1 Baselines 

The first simple baseline algorithm (denoted as baselinel) 
is to treat the current number of purchases as the future 
number of purchases, and hence it guarantees less than 100% 
relative error, given that the number is increasing and always 
positive. 

Another baseline algorithm (denoted as baseline2) is to
assume a linear relationship between the current number
of purchases and the future number of purchases. Suppose
we know the number of purchases $N_{t_1}$ at time $t_1$, and aim
to predict the number of purchases $N_{t_2}$ at time $t_2$, where
$t_1 < t_2$. Then we assume that

$$N_{t_2} = \alpha N_{t_1} + \beta \tag{13}$$

where $\alpha$ and $\beta$ are model parameters that can be learned
from training data.
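
The two baselines can be sketched in a few lines. The training pairs below are made-up $(N_{t_1}, N_{t_2})$ values chosen for illustration, and the function names are hypothetical; the relative error shown is one common way to score such predictions and is an assumption here.

```python
def baseline1(n_t1):
    """baseline1: predict that the count stays at its current value."""
    return n_t1

def fit_baseline2(pairs):
    """Least-squares fit of Equation (13), N_t2 = alpha * N_t1 + beta."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    alpha = sum((x - x_bar) * (y - y_bar) for x, y in pairs) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return alpha, y_bar - alpha * x_bar

def relative_error(predicted, actual):
    """Absolute prediction error relative to the actual count."""
    return abs(predicted - actual) / actual

# Made-up training pairs (purchases at t1, purchases at t2).
train = [(10, 35), (20, 65), (30, 95), (40, 125)]
alpha, beta = fit_baseline2(train)

n_t1, n_t2 = 25, 80                # a made-up test deal
err1 = relative_error(baseline1(n_t1), n_t2)
err2 = relative_error(alpha * n_t1 + beta, n_t2)
print(err1, err2)
```

Since purchase counts are positive and non-decreasing, `baseline1` never overshoots, which is why its relative error stays below 100%.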

5.1.2 Social Propagation Model 

As seen in Figure 7, the growth in sales after tipping in
Groupon is described well by a multiplicative process. It
follows from the model that to obtain the popularity for
the next time step we multiply the current popularity by
a small, random amount. More specifically, let $t_1$ and $t_2$
denote two different time steps with $t_1 < t_2$. Following [16],

$$\log N_{t_2} = \log N_{t_1} + \sum_{t=t_1}^{t_2} r(t) X_t \qquad (14)$$



according to Equation @ 

This process, called "growth with random multiplicative
noise", describes the dynamics of users' attention to web
content [17]. While the increments at each time step are
random, their expected value over many time steps adds up
ultimately to $\sum_{t=t_1}^{t_2} r(t) X_t$ in the log-linear
model, which accounts for the linear relationship between
the log-transformed popularities at different times $t_1$ and
$t_2$.

Here, we introduce the process used to model and predict
the future number of purchases of a deal. We first perform
a logarithmic transformation on the number of purchases,
similar to [16], [3]. To help determine whether the number
of purchases early on is a predictor of the later number of
purchases, see Figure 8, which shows the number of purchases
at the reference time $t_1 = 8$ hours vs. the number of
purchases at the end of a day (i.e., $t_2 = 24$ hours) in both
Groupon and LivingSocial. We logarithmically rescaled the
horizontal and vertical axes in the figure to show the number
of purchases for different deals, which span four orders of
magnitude.





[Figure 8 panels: (a) Groupon; (b) LivingSocial. Horizontal axis: # of purchases after 8 hours.]

Figure 8: Number of purchases after 8 hours vs.
number of purchases after one day (log-scale). The
bold line is the linear fit to the data.

Figure 8 shows that there is a strong correlation between
the earlier observations of the number of purchases of
a deal and the later observations. We can therefore determine
the linear regression coefficients between the log-transformed
numbers of purchases at $t_1$ and $t_2$ on a given training
dataset, and then use the estimated coefficients to
extrapolate on the test dataset.
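The fit-then-extrapolate procedure described above can be sketched as follows. This is a minimal illustration that assumes an ordinary least-squares fit on log-transformed counts; the function names `fit_log_linear` and `predict_sp` and the toy training data are hypothetical and do not come from the paper.

```python
import math


def fit_log_linear(n_t1_train, n_t2_train):
    """Least-squares fit of log N_t2 ~ slope * log N_t1 + intercept."""
    xs = [math.log(n) for n in n_t1_train]
    ys = [math.log(n) for n in n_t2_train]
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_sp(n_t1, slope, intercept):
    """Predict the count at t2 from the count at t1, on the original scale."""
    return math.exp(slope * math.log(n_t1) + intercept)


# Toy training deals (invented): purchases after 8 hours and after 24 hours.
after_8h = [10, 100, 1000]
after_24h = [30, 300, 3000]  # constructed as 3 * x, so slope 1 and intercept log(3)
slope, intercept = fit_log_linear(after_8h, after_24h)
prediction = predict_sp(50, slope, intercept)
```

Because the fit is done in log space, every count passed in must be strictly positive; how deals with zero purchases at $t_1$ should be handled is not specified here.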

Note that there is a limitation to this approach. As we 
discussed before, in Groupon a renewal process, rather than 
a multiplicative one, governs the dynamics before tipping. 
So this approach may not perform well for the very early 
observations. Nevertheless, it is applicable to both Groupon 
and LivingSocial since the multiplicative process is the main 
process during the life cycle of a deal for both services. 

5.2 Evaluation 

In this subsection, we conduct an experimental study to 
evaluate the proposed prediction algorithms. As discussed 
before, the important task is to be able to predict how suc- 
cessful a deal will be. Since there are many deals with a 
lifetime of one day we evaluate the performance of different 
algorithms by how accurately they can predict the number 
of purchases of a deal after one day. Here, we use relative 



error, i.e.,

$$\frac{|\text{real purchases} - \text{predicted purchases}|}{\text{real purchases}}$$



as the per- 
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(d) Social Propagation (SP) Model 
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Figure 9: Performance comparison of prediction of the number of purchases after one day in Groupon. In 
(a)-(e), lines denote the average relative error, and shaded regions cover the areas of one-standard error. 



Deal Title: [The Magnetic Field - Asheville] $12 for Two Tickets to a Theater Performance (Up to $28 Value)

Algorithm                            Real purchases   Predicted purchases   Relative error
baseline1 with 12-hour observation   251              93                    0.63
baseline2 with 12-hour observation   251              482                   0.92
MLR                                  251              51                    0.80
SP with 12-hour observation          251              355                   0.42

Deal Title: [Lime Leaf Thai Cuisine - Hendersonville] $10 for $20 Worth of Thai Fusion Cuisine

Algorithm                            Real purchases   Predicted purchases   Relative error
baseline1 with 12-hour observation   384              169                   0.56
baseline2 with 12-hour observation   384              714                   0.86
MLR                                  384              1,452                 2.783
SP with 12-hour observation          384              463                   0.21

Table 2: Example prediction results for Groupon deals.



formance metric to measure accuracy. 
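As a small worked example of this metric, the sketch below recomputes two of the relative errors reported in Table 2 for the first Groupon deal (251 real purchases). The function name `relative_error` is an assumption made for illustration.

```python
def relative_error(real, predicted):
    """Absolute difference between real and predicted purchases, relative to the real value."""
    return abs(real - predicted) / real


# Real and predicted purchase counts taken from Table 2 (first deal).
errors = {
    "baseline1": relative_error(251, 93),
    "baseline2": relative_error(251, 482),
}
```

Rounded to two decimals these give 0.63 and 0.92, matching the table. The second value also shows that only an underestimate is bounded by 100%; an overestimate such as the MLR prediction for the second deal can exceed it.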

5.2.1 Experiments with Groupon Deals 

First, we conduct experiments on the Groupon dataset by
randomly splitting it into halves, where one half is used for
training and the other half for testing.
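A random half split of this kind might look like the sketch below. The helper name `split_in_halves`, the fixed seed, and the integer deal identifiers are assumptions for illustration; the paper does not describe how the split was implemented.

```python
import random


def split_in_halves(deals, seed=0):
    """Shuffle a copy of the deals and return (training half, testing half)."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(deals)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical deal identifiers standing in for the real dataset.
deal_ids = list(range(10))
train, test = split_in_halves(deal_ids)
```

Each deal lands in exactly one of the two halves, so coefficients learned on `train` are evaluated only on deals the model has not seen.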

In Figure 9 we find that baseline1 shows the best performance
among all the tested algorithms with fewer than 7 hours of
observations. After 7 hours of observation, our proposed
social propagation model (denoted as SP) shows the
best performance. Note that a deal which attracts more
than a hundred purchases within the first hour after launching
(6 deals in total in the experiment) is treated differently
by applying baseline1, as these deals are extremely popular
and do not follow the general multiplicative process. The
justification for applying baseline1 is that these deals are so
appealing that local merchants usually place quantity limits.

As we observed before, deals in Groupon usually tip
after about 7 hours. Before tipping, the purchase dynamics
are governed by random discovery instead of the multiplicative
process, so the social propagation model fails to
achieve good performance. However, we find that there is
an inflection point at about 7 hours. After 7
hours of observations, the social propagation model exhibits
relatively good performance, and it performs much better
with more hours of observation. In Figure 9(f), the relative
error distributions of baseline1 and SP with 12-hour
observation are examined. We find that the relative error is
less than 50% for over 90% of deals when using SP, and about
70% of deals achieve less than 20% relative error
when applying SP.
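Shares such as "over 90% of deals below 50% relative error" are simple threshold fractions over the per-deal errors. The sketch below shows the computation on invented error values; `fraction_below` and `sample_errors` are hypothetical names, and the numbers are not taken from the experiments.

```python
def fraction_below(errors, threshold):
    """Share of deals whose relative error is strictly below the threshold."""
    return sum(1 for e in errors if e < threshold) / len(errors)


# Invented per-deal relative errors, for illustration only.
sample_errors = [0.05, 0.12, 0.18, 0.30, 0.65]
share_under_20 = fraction_below(sample_errors, 0.20)
share_under_50 = fraction_below(sample_errors, 0.50)
```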

In the experiment, we incorporated all the attributes of
the deals into the multi-linear regression (denoted as MLR)
model, including the tipping point. Tipping points can be
considered as the observation of the number of purchases at
around 6-8 hours. Therefore, as shown in Figure 9(f), the
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Figure 10: Performance comparison of prediction of the number of purchases after one day in LivingSocial.In 
(a)-(d), lines denote the average relative error, and shaded regions cover the areas of one-standard error.



multi-linear regression model achieves performance comparable
to our model within an observation period of 6
hours. To exemplify the prediction accuracy, we show the
results from a few Groupon deals in Table 2.

As a refinement for Groupon deals, we apply baseline1
if the deal has not tipped; otherwise, we apply the social
propagation (SP) model.
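This refinement amounts to a two-branch predictor. The sketch below assumes that a tipping flag is available for each deal and that `slope` and `intercept` are coefficients of a log-linear fit obtained beforehand on training data; all names are hypothetical.

```python
import math


def predict_refined(n_now, has_tipped, slope, intercept):
    """Use baseline1 before tipping and the log-linear SP extrapolation afterwards."""
    if not has_tipped:
        # baseline1: carry the current count forward unchanged.
        return n_now
    # SP: extrapolate in log space, then map back to a purchase count.
    return math.exp(slope * math.log(n_now) + intercept)


# Illustrative coefficients (invented): slope 1 and intercept log(2) double the count.
before_tipping = predict_refined(40, False, 1.0, math.log(2))
after_tipping = predict_refined(40, True, 1.0, math.log(2))
```

Switching on the tipping event rather than on a fixed number of hours lets the same rule serve deals that tip earlier or later than the typical 7 hours.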

5.2.2 Experiments with LivingSocial Deals

We conducted similar experiments on the LivingSocial
dataset. As shown in Figure 10, our social propagation
model (SP) always outperforms baseline2 and beats
baseline1 with more than 2 hours of observations. Because of
the limitations of the crawling technique, we do not have
information about which deal is the featured one in a given
city, and there is no tipping point in LivingSocial, which
prevents the multi-linear regression model from generating
good predictions. However, the social propagation model
shows very good performance in LivingSocial. In particular,
we examine the distribution of relative errors for predictions
based on SP and baseline1 with 12 hours of observations in
LivingSocial. As shown in Figure 10(e), about 65% of deals
have less than 50% relative error, and SP always outperforms
baseline1.

Similarly, we show prediction results from some LivingSocial
deals in Table 3. As shown in Table 3, the social
propagation model exhibits better prediction performance
than both baselines, in terms of relative error.

Finally, our design for purchase prediction of LivingSocial
deals is to apply baseline1 with fewer than 3 hours of
observation; otherwise, we apply the social propagation (SP)
model. Note that due to the different mechanisms in Groupon
and LivingSocial, the inflection points occur at very
different times (i.e., 6-8 hours in Groupon, and 2-4 hours in
LivingSocial). Therefore, SP can be applied earlier in
LivingSocial than in Groupon. However, as shown in Figure 9(e)
and Figure 10(e), the relative error measured on the test set
decreases rapidly for Groupon, while for LivingSocial the
prediction converges more slowly to the actual value. In
LivingSocial, the expected relative error obtained when
estimating the one-day purchases of a deal by using SP is about
20% after 17 hours, while the same relative error is attained
13 hours after a Groupon deal is launched. This is because
novelty decay is faster in Groupon than in LivingSocial: it
takes 7 hours in Groupon to reach the saturation point, while
it takes about 14 hours in LivingSocial (see Figure 7). So it
is easier to predict the one-day purchases of Groupon deals
with fewer hours of observations (after tipping). One possible
explanation is that the tipping point incentive mechanism for
propagating deals in Groupon disappears after the tipping
point has been reached. In LivingSocial, on the other hand,
the incentive to propagate a deal is always present for at
least some users and, furthermore, the individual gain of
propagating is greater.



6. CONCLUSIONS 

In this paper, we presented a study of the group purchas- 
ing behavior of daily deals in Groupon and LivingSocial and 
introduced a predictive dynamic model of collective atten- 
tion for group buying behavior. Using large data sets from 
both Groupon and LivingSocial we showed how the model 
was able to predict the popularity of group deals as a func- 
tion of time. Our main finding is that the different incentive 
mechanisms in Groupon and LivingSocial lead to different 
propagation behavior, which in turn leads to differences in 
predictability. However, the basic stochastic processes as 
well as the distributional parameters of growth and decay 
are strikingly similar. Given that Groupon no longer pro- 



Deal Title: [Coastal Contacts] $60 to Spend on Prescription Eyeglasses (Now $19)

Model                                Real purchases   Predicted purchases   Relative error
baseline1 with 12-hour observation   129              32                    0.75
baseline2 with 12-hour observation   129              245                   0.90
SP with 12-hour observation          129              110                   0.14

Deal Title: [Dawgs!] $10 (Pay $5) or $20 (Pay $10) to Spend on Food and Drink

Model                                Real purchases   Predicted purchases   Relative error
baseline1 with 12-hour observation   75               28                    0.63
baseline2 with 12-hour observation   75               147                   0.96
SP with 12-hour observation          75               110                   0.47

Table 3: Example prediction results for LivingSocial deals.



vides detailed statistics of purchases over time, the models 
presented here cannot easily be applied by any observer.
However, both deal site owners and merchants should be 
able to benefit from analyzing the early stream of purchases 
using the models presented here. Our work also gives some 
insights into how different incentive mechanisms can affect 
the longevity of propagation momentum. These insights 
could be exploited in local marketing campaigns where viral 
and social dissemination of offers is desirable. 
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